Similar resources
A methodology for topographic clustering of structured text documents
Sets of texts are structured through a more or less refined hierarchy of sections, subsections and paragraphs; this structure contains information that should be exploited to handle these data and in particular, to enrich the comparison of texts, as a complement to the vector description of their contents. We propose a kernel-based methodology that follows this principle for a topographic clust...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...
The Comparison of SOM and K-means for Text Clustering
SOM and k-means are two classical methods for text clustering. This paper reports experiments comparing their performance. The sample data consists of 420 articles drawn from different topics. K-means is simple and easy to implement; the structure of SOM is relatively complex, but its clustering results are more visual and easier to comprehend. The comparison results also...
Text Clustering Algorithms: A Review
With the growth of the Internet, the amount of text data created by media such as social networking sites, the web, and other information sources is increasing. This data is unstructured, which makes it tedious to analyze, so we need methods and algorithms that can be used with various types of text formats. Clustering is an important part of data mining. Clust...
Text Clustering Exploration Swedish Text Representation and Clustering Results Unraveled
Text clustering divides a set of texts into clusters (parts), so that texts within each cluster are similar in content. It may be used to uncover the structure and content of unknown text sets as well as to give new perspectives on familiar ones. The main contributions of this thesis are an investigation of text representation for Swedish and some extensions of the work on how to use text clust...
Journal
Journal title: International Journal of Computer Applications
Year: 2016
ISSN: 0975-8887
DOI: 10.5120/ijca2016909515